# AI-Ready Data Guide — GDP Evolution in Latin America (2000-2024)

## Overview
This dataset has been prepared for AI/ML workflows. All six tables (five data tables plus a bibliography) are available in three formats:
- **CSV** (clean, snake_case headers, UTF-8, no BOM)
- **JSON** (records format with embedded schema metadata)
- **Parquet** (columnar, Snappy-compressed, strongly typed)

---

## Quick Load Examples

### Python / pandas
```python
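import json

import pandas as pd

# CSV
df = pd.read_csv("GDP_LatinAmerica_2000_2024_clean.csv")

# JSON: the file is an object with "schema" and "data" keys, so parse it
# with json first and build the frame from the records under "data"
with open("GDP_LatinAmerica_2000_2024.json", encoding="utf-8") as f:
    payload = json.load(f)
df = pd.DataFrame(payload["data"])

# Parquet (fastest, typed); needs pyarrow or fastparquet installed
df = pd.read_parquet("GDP_LatinAmerica_2000_2024.parquet")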
```

### R
```r
library(arrow)
library(jsonlite)

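# CSV
df <- read.csv("GDP_LatinAmerica_2000_2024_clean.csv", fileEncoding = "UTF-8")

# JSON (records are under the "data" key)
payload <- fromJSON("GDP_LatinAmerica_2000_2024.json")
df <- payload$data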

# Parquet
df <- read_parquet("GDP_LatinAmerica_2000_2024.parquet")
```

### DuckDB (SQL analytics)
```sql
-- Direct Parquet query (no load needed)
SELECT * FROM read_parquet('GDP_LatinAmerica_2000_2024.parquet');

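-- All GDP_* tables at once (the glob does not match references_gdp).
-- The tables have different columns, so align them by name and keep
-- the source file name to tell the rows apart.
SELECT * FROM read_parquet('GDP_*.parquet', union_by_name = true, filename = true);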
```

### HuggingFace Datasets
```python
from datasets import load_dataset
ds = load_dataset("parquet", data_files="GDP_LatinAmerica_2000_2024.parquet")
```

---

## File Index

| Table | CSV | JSON | Parquet | Rows | Description |
|-------|-----|------|---------|------|-------------|
| GDP_LatinAmerica_2000_2024 | ✅ | ✅ | ✅ | 7 | Nominal GDP (billions USD) |
| GDP_Annual_Growth_LatinAmerica | ✅ | ✅ | ✅ | 11 | Annual growth rates (%) |
| GDP_PerCapita_LatinAmerica_2000_2024 | ✅ | ✅ | ✅ | 11 | GDP per capita (USD) |
| GDP_economic_structure_latam | ✅ | ✅ | ✅ | 11 | Sector contributions (%) |
| GDP_global_share_latam | ✅ | ✅ | ✅ | 11 | Global GDP share (%) |
| references_gdp | ✅ | ✅ | ✅ | 18 | Bibliography |
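
After downloading, it can be worth confirming that each file has the row count listed in the index above. The following is a minimal sketch using only the standard library. The expected counts are copied from the table; the `<table>_clean.csv` naming is an assumption generalized from the single CSV file name shown in the load examples, and `check_row_counts` is a hypothetical helper, not part of the dataset.

```python
import csv
from pathlib import Path

# Row counts copied from the File Index table.
EXPECTED_ROWS = {
    "GDP_LatinAmerica_2000_2024": 7,
    "GDP_Annual_Growth_LatinAmerica": 11,
    "GDP_PerCapita_LatinAmerica_2000_2024": 11,
    "GDP_economic_structure_latam": 11,
    "GDP_global_share_latam": 11,
    "references_gdp": 18,
}


def count_csv_rows(path):
    """Count data rows in a CSV file, excluding the header line."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        next(reader, None)  # skip the header
        return sum(1 for row in reader if row)  # ignore blank lines


def check_row_counts(data_dir, expected=EXPECTED_ROWS, suffix="_clean.csv"):
    """Return {table: (expected, found)} for every table that does not match.

    `found` is None when the file is missing.
    ASSUMPTION: every CSV is named "<table>_clean.csv".
    """
    problems = {}
    for table, want in expected.items():
        path = Path(data_dir) / f"{table}{suffix}"
        found = count_csv_rows(path) if path.exists() else None
        if found != want:
            problems[table] = (want, found)
    return problems
```

Calling `check_row_counts(".")` in the dataset directory returns an empty dict when every file matches the index. Pass a different `suffix` if the CSV files are named differently.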

---

## Format Recommendations

| Use Case | Recommended Format |
|----------|--------------------|
| Exploratory analysis | CSV |
| LLM / RAG pipelines | JSON |
| ML training / big data | Parquet |
| SQL analytics (DuckDB, Spark) | Parquet |
| Web APIs | JSON |
| Excel / Sheets import | CSV |
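
For the LLM / RAG row, one common approach is to turn each JSON record into a short text passage before embedding or indexing it. The sketch below is illustrative only: `records_to_passages` and `load_passages` are hypothetical helpers, and the `key: value` rendering is one possible choice rather than a format this dataset prescribes. It relies only on the records sitting under a top-level `"data"` key.

```python
import json


def records_to_passages(records, table_name):
    """Render each record as one 'key: value' passage, skipping null fields."""
    passages = []
    for record in records:
        fields = [
            f"{key}: {value}"
            for key, value in record.items()
            if value is not None
        ]
        # Prefix with the table name so a retrieved passage keeps its context.
        passages.append(f"[{table_name}] " + "; ".join(fields))
    return passages


def load_passages(json_path, table_name):
    """Read one table's JSON file and return its passages.

    ASSUMPTION: the records are a list of dicts under the "data" key.
    """
    with open(json_path, encoding="utf-8") as handle:
        payload = json.load(handle)
    return records_to_passages(payload["data"], table_name)
```

For example, `load_passages("GDP_LatinAmerica_2000_2024.json", "GDP_LatinAmerica_2000_2024")` would return one string per record, ready to pass to an embedding step.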

---

## Schema Notes
- All CSV files use **snake_case** column names, UTF-8 encoding, comma delimiter
- JSON files include a `schema` block with field names, original names, and types
- Parquet files use **Snappy compression** and strongly typed columns (Float64/Int64/Utf8)
- Country names are in English; an `iso3` column (ISO 3166-1 alpha-3 code) is available where applicable
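
The `schema` block in the JSON files can be used to check that every record carries the documented fields. The sketch below assumes a layout this guide does not spell out: that `schema` contains a `fields` list whose entries each have a `name` key. Inspect one file and adjust `schema_field_names` if the actual layout differs. All function names here are hypothetical.

```python
import json


def load_table(json_path):
    """Return (records, schema) from one of the JSON files."""
    with open(json_path, encoding="utf-8") as handle:
        payload = json.load(handle)
    return payload["data"], payload["schema"]


def schema_field_names(schema):
    # ASSUMPTION: schema["fields"] is a list of dicts with a "name" key.
    return [field["name"] for field in schema["fields"]]


def find_key_mismatches(records, schema):
    """List (row_index, missing_keys, unexpected_keys) for deviating rows."""
    expected = set(schema_field_names(schema))
    mismatches = []
    for index, record in enumerate(records):
        keys = set(record)
        if keys != expected:
            mismatches.append(
                (index, sorted(expected - keys), sorted(keys - expected))
            )
    return mismatches
```

An empty list from `find_key_mismatches(*load_table(path))` means every record has exactly the fields named in the schema.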

---

## Provenance
- Original data: World Bank WDI, IMF WEO, ECLAC statistics
- AI-ready formats generated: 2026-03-21
- License: CC0 1.0 (Public Domain)
